Abstract

We consider the synthesis problem of Compressed Sensing: given $s$ and an $M\times n$ matrix $A$, extract from it an $m\times n$ submatrix $A_m$, certified to be $s$-good, with $m$ as small as possible. Starting from the verifiable sufficient conditions of $s$-goodness, we express the synthesis problem as the problem of approximating a given matrix by a matrix of specified low rank in the uniform norm. We propose randomized algorithms for efficient construction of rank $k$ approximations of matrices of size $m\times n$ achieving accuracy bounds $O(1)\sqrt{\ln(mn)/k}$ which hold in expectation or with high probability. We also supply derandomized versions of the approximation algorithms which do not require random sampling of matrices and attain the same accuracy bounds. We further demonstrate that our algorithms are optimal up to a factor logarithmic in $m, n$, i.e., the accuracy of such an approximation for the identity matrix $I_n$ cannot be better than $O(1)k^{-1/2}$. We provide preliminary numerical results on the performance of our algorithms for the synthesis problem.



1 Introduction 

Let $A\in\mathbb{R}^{m\times n}$ be a matrix with $m<n$. Compressed Sensing focuses on recovery of a sparse signal $x\in\mathbb{R}^n$ from its noisy observations

$$y = Ax+\delta,$$

where $\delta$ is an observation noise such that $\|\delta\|\le\varepsilon$ for a certain known norm $\|\cdot\|$ on $\mathbb{R}^m$ and some given $\varepsilon$. The standard recovering routine is

$$\widehat{x}\in\mathop{\mathrm{Argmin}}_w\{\|w\|_1:\ \|Aw-y\|\le\varepsilon\}.$$

We call the matrix $A$ $s$-good if, whenever the true signal $x$ is $s$-sparse (i.e., has at most $s$ nonzero entries) and there are no observation errors ($\varepsilon=0$), $x$ is the unique optimal solution to the optimization program $\min_w\{\|w\|_1:\ Aw=Ax\}$.

To the best of our knowledge, nearly the strongest verifiable sufficient condition for $A$ to be $s$-good is as follows (cf. [5]):

$$\text{There exists } Y\in\mathbb{R}^{m\times n} \text{ such that } \|I_n-Y^TA\|_\infty<\frac{1}{2s} \tag{1}$$

(here and in what follows $\|X\|_\infty=\max_{i,j}|X_{ij}|$, $X_{ij}$ being the elements of $X$).^1
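Condition (1) is straightforward to test numerically once a candidate $Y$ is at hand. The following is a minimal sketch in plain Python, not part of the original development: the function name `certified_sparsity`, the list-of-rows matrix representation, and the convention of capping the answer at $n$ when the residual vanishes are all assumptions made for illustration.

```python
def certified_sparsity(Y, A):
    """Largest integer s with ||I_n - Y^T A||_inf < 1/(2s); 0 if no s >= 1 works.

    Y and A are lists of rows with the same shape (rows x n).
    """
    rows, n = len(A), len(A[0])
    mu = 0.0
    for p in range(n):
        for q in range(n):
            w_pq = sum(Y[i][p] * A[i][q] for i in range(rows))  # entry of Y^T A
            target = 1.0 if p == q else 0.0
            mu = max(mu, abs(target - w_pq))
    if mu == 0.0:
        return n  # assumption: cap at the signal dimension when the residual is zero
    s = int(1.0 / (2.0 * mu))
    # enforce the strict inequality mu < 1/(2s)
    while s > 0 and not mu < 1.0 / (2 * s):
        s -= 1
    return min(s, n)
```

For instance, with a $4\times4$ Hadamard matrix $A$ and $Y=0.2A$ one gets $Y^TA=0.8I_4$, so the residual is $0.2$ and the certified level is $s=2$.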

In this paper we consider the synthesis problem of Compressed Sensing as follows:

Given $s$ and an $M\times n$ matrix $A$, extract from it an $m\times n$ submatrix $A_m$, certified to be $s$-good, with $m$ as small as possible.

One can think, e.g., of a spatial or planar $n$-point grid $\Gamma$ of possible locations of signal sources and an $M$-element grid $\widetilde\Gamma$ of possible locations of sensors. A sensor in a given location measures a known linear form, depending on the location, of the signals emitted at the nodes of $\Gamma$, and the goal is to place a given number $m\ll M$ of sensors at the nodes of $\widetilde\Gamma$ in order to be able to recover the location of sources via $\ell_1$-minimization, provided that there are at most $s$ sources. Since the property of $s$-goodness is difficult to verify, we will look for a submatrix of the original matrix $A$ for which $s$-goodness can be certified by the sufficient condition (1). Suppose that along with $A$ we know an $M\times n$ matrix $Y_M$ which certifies that the "level of goodness" of $A$ is at least $s$, that is, we have

$$\|I_n-Y_M^TA\|_\infty\le\mu<\frac{1}{2s}. \tag{2}$$

Then we can approach the synthesis problem as follows:

Given $M\times n$ matrices $Y_M$ and $A$ and a tolerance $\varepsilon>0$, we want to extract from $A$ $m$ rows (the smaller $m$, the better) to get an $m\times n$ matrix $A_m$ which, along with a properly chosen $Y_m\in\mathbb{R}^{m\times n}$, satisfies the relation

$$\|Y_M^TA-Y_m^TA_m\|_\infty\le\varepsilon.$$

Choosing $\varepsilon<\frac{1}{2s}-\mu$ and invoking (2), we ensure that the output $A_m$ of the above procedure is $s$-good. This simple observation motivates our interest in the problem of approximating a given matrix by a matrix of specified (low) rank in the uniform norm.

^1 We address the reader to [5] for details concerning the derivation, the link to the necessary and sufficient condition of $s$-goodness, and its comparison to traditional non-verifiable sufficient conditions for $s$-goodness based on the Restricted Isometry or Restricted Eigenvalue Property and to a verifiable sufficient condition based on mutual incoherence.

Note that in the existing literature on low rank approximation of matrices the emphasis is on efficient construction when the approximation error is measured in the Frobenius norm (for the Frobenius norm, $\|A\|_F=\big(\sum_{i,j}A_{ij}^2\big)^{1/2}$). Though the Singular Value Decomposition (SVD) gives the best rank $k$ approximation in terms of all the norms that are invariant under rotation (e.g., the Frobenius norm and the spectral norm), its computational cost may be prohibitive for applications involving large matrices. Recently, the
properties of fast low rank approximations in the Frobenius norm based on the random- 
ized sampling of rows (or columns) of the matrix (see, e.g., [21 S]) or random sampling 
of a few individual entries (see [1] and references therein) have been studied extensively.
Another randomized fast approximation based on the preprocessing by the Fast Fourier 
Transform or Fast Hadamard Transform has been studied in [6]. Yet we do not know 
explicit bounds available from the previous literature which concern numerically efficient 
low rank approximations in the uniform norm. 

In this work, we aim at developing efficient algorithms for building low rank approxi- 
mation of a given matrix in the uniform norm. Specifically, we consider two types of low 
rank approximations: 

1. Let $W=Y^TA$, where $Y$ and $A$ are known $M\times n$ matrices. We consider the approximation $W_k=Y_k^TA_k$ of $W$ such that the matrices $Y_k$ and $A_k$ of dimension $m_k\times n$, $m_k\le k\le M$, are composed of multiples of the rows of the matrices $Y$ and $A$, respectively. We show that a fast (essentially, of numerical complexity $O(kMn^2)$) approximation $W_k$ can be constructed which satisfies

$$\|W-W_k\|_\infty\le 2L(Y,A)\sqrt{\frac{2\ln[2n^2]}{k}},$$

where $L(Y,A)=\sum_i\|y_i\|_\infty\|a_i\|_\infty$ and $y_i^T$, $a_i^T$ denote the $i$-th rows of $Y$ and $A$, respectively. Note that for moderate values of $L(Y,A)=O(1)$ and $k\le n/2$ this approximation is "quasi-optimal", as we know (cf., e.g., [5, Proposition 4.2]) that (for certain matrices $W$) the accuracy of such an approximation cannot be better than $O(k^{-1/2})$.

2. Let $A\in\mathbb{R}^{m\times n}$, $A=MN^T$, where $M\in\mathbb{R}^{m\times d}$ and $N\in\mathbb{R}^{n\times d}$. We consider a fast approximation $A_k=\sum_{i=1}^k\eta_i\zeta_i^T$ of $A$, where $\eta_i$ and $\zeta_i$ are linear combinations of columns of $M$ and $N$, respectively. We show that this approximation satisfies

$$\|A-A_k\|_\infty\le O(1)\,D^2\sqrt{\frac{\ln[mn]}{k}},$$

where $D$ is the maximal Euclidean norm of rows of $M$ and $N$. We show that when $A$ is an $n\times n$ identity matrix the above bound is unimprovable up to a logarithmic factor.

In this paper we propose two types of construction of fast approximations. First, we consider the randomized construction, for which the accuracy bounds above hold in expectation (or with significant probability). We also supply "derandomized" versions of the approximation algorithms which do not require random sampling of matrices and attain the same accuracy bounds as the randomized method.
2 Low rank approximation in Compressed Sensing

In this section we suppose that we are given $s$ and an $M\times n$ matrix $A$, and our objective is to extract from $A$ a submatrix $A_k$ which is composed of at most $k$ rows of $A$, with $k$ as small as possible, and which is $s$-good. We assume that $A$ admits a "goodness certificate" $Y$. Namely, we are given an $M\times n$ matrix $Y$ such that

$$\|I_n-Y^TA\|_\infty\le\mu<\frac{1}{2s}, \tag{3}$$

and we are looking for $A_k$ and the corresponding $Y_k$ such that $\|I_n-Y_k^TA_k\|_\infty<\frac{1}{2s}$.



2.1 Random sampling algorithm 

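The starting point of our developments is the following simple

Lemma 2.1 For $\beta>0$ and $z\in\mathbb{R}^d$, let

$$V_\beta(z)=\beta\ln\Big(\sum_{i=1}^d\cosh(z_i/\beta)\Big)-\beta\ln d. \tag{4}$$

Then

(i) we have $\|z\|_\infty-\beta\ln[2d]\le V_\beta(z)\le\|z\|_\infty$;

(ii) if $\beta'<\beta$ then $V_{\beta'}(z)\ge V_\beta(z)$;

(iii) the function $V_\beta$ is convex and continuously differentiable on $\mathbb{R}^d$. Further, its gradient is Lipschitz-continuous with the constant $\beta^{-1}$:

$$\|V_\beta'(z_1)-V_\beta'(z_2)\|_1\le\beta^{-1}\|z_1-z_2\|_\infty, \tag{5}$$

and $\|V_\beta'(z)\|_1\le1$ for all $z\in\mathbb{R}^d$.

For the proof, see Appendix A.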
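As a quick numerical sanity check of items (i) and (ii), one may evaluate the smoothed max-norm directly. The sketch below assumes the form $V_\beta(z)=\beta\ln\big(\sum_{i}\cosh(z_i/\beta)\big)-\beta\ln d$, which is the form consistent with property (i) and with $V_\beta(0)=0$; the function name `V` and the overflow-safe evaluation (factoring out $e^{\|z\|_\infty/\beta}$) are illustrative choices, not taken from the text.

```python
import math

def V(beta, z):
    """Smoothed max-norm: beta*ln(sum_i cosh(z_i/beta)) - beta*ln(d)."""
    d = len(z)
    m = max(abs(v) for v in z) / beta  # largest exponent, factored out for stability
    total = sum(
        0.5 * (math.exp(abs(v) / beta - m) + math.exp(-abs(v) / beta - m))
        for v in z
    )  # equals exp(-m) * sum_i cosh(z_i / beta)
    return beta * (m + math.log(total)) - beta * math.log(d)
```

Small values of $\beta$ make $V_\beta(z)$ track $\|z\|_\infty$ closely, while large values flatten it toward $0$; this trade-off is what the choice of $\beta$ in the results below balances.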

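Lemma 2.1 has the following immediate consequence:

Proposition 2.1 Let $\beta\ge\beta'>0$ (non-random) and let $\xi_1,\dots,\xi_k$ be random vectors in $\mathbb{R}^d$ such that $E\{\xi_i\,|\,\xi_1,\dots,\xi_{i-1}\}=0$ a.s., and $E\{\|\xi_i\|_\infty^2\}\le\sigma_i^2<\infty$ for all $i\in\{1,\dots,k\}$, and let $S_k=\sum_{i=1}^k\xi_i$. Then

$$E\{V_\beta(S_k)\}\le E\{V_{\beta'}(S_{k-1})\}+\frac{\sigma_k^2}{2\beta'}. \tag{6}$$

As a result,

$$E\{\|S_k\|_\infty\}\le\sqrt{2\ln[2d]\sum_{i=1}^k\sigma_i^2}. \tag{7}$$

Proof. Let $\beta\ge\beta'$. By applying items (ii) and (iii) of the lemma we get:

$$V_\beta(S_k)\le V_{\beta'}(S_k)\le V_{\beta'}(S_{k-1})+\langle V_{\beta'}'(S_{k-1}),\xi_k\rangle+\frac{\|\xi_k\|_\infty^2}{2\beta'}.$$

When taking the expectation (first conditional to $\xi_1,\dots,\xi_{k-1}$), due to $E\{\xi_k\,|\,\xi_1,\dots,\xi_{k-1}\}=0$ a.s., we obtain

$$E\{V_\beta(S_k)\}\le E\{V_{\beta'}(S_{k-1})\}+\frac{E\{\|\xi_k\|_\infty^2\}}{2\beta'}\le E\{V_{\beta'}(S_{k-1})\}+\frac{\sigma_k^2}{2\beta'},$$

which is (6). Now let us set $\beta'=\beta=\sqrt{\frac{\sum_{i=1}^k\sigma_i^2}{2\ln[2d]}}$. Since $V_\beta(0)=0$, we conclude that

$$E\{V_\beta(S_k)\}\le\frac{\sum_{i=1}^k\sigma_i^2}{2\beta}.$$

On the other hand, by item (i) of Lemma 2.1,

$$E\{\|S_k\|_\infty\}\le\beta\ln[2d]+E\{V_\beta(S_k)\}\le\beta\ln[2d]+\frac{\sum_{i=1}^k\sigma_i^2}{2\beta}=\sqrt{2\ln[2d]\sum_{i=1}^k\sigma_i^2}.$$

□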
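The choice of $\beta$ in the last step of the proof can be verified by a one-line minimization. In the worked computation below, the shorthands $\sigma^2:=\sum_{i=1}^k\sigma_i^2$ and $\ell:=\ln[2d]$ are introduced only for this sketch:

```latex
\min_{\beta>0}\Big\{\beta\ell+\frac{\sigma^2}{2\beta}\Big\}:\qquad
\ell-\frac{\sigma^2}{2\beta^2}=0
\;\Longrightarrow\;
\beta_*=\sqrt{\frac{\sigma^2}{2\ell}},
\qquad
\beta_*\ell+\frac{\sigma^2}{2\beta_*}
=\sqrt{\frac{\sigma^2\ell}{2}}+\sqrt{\frac{\sigma^2\ell}{2}}
=\sqrt{2\ell\sigma^2}.
```

With $\sigma_i=2L$ and $d=n^2$, as in the application to matrices below, this gives $\beta_*=L\sqrt{2k/\ln[2n^2]}$ and the bound $2L\sqrt{2k\ln[2n^2]}$.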



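The random sampling algorithm. Denoting by $y_i^T$ and $a_i^T$, $i=1,\dots,M$, the $i$-th rows of $Y$ and $A$, respectively, let us set

$$\theta_i=\|y_i\|_\infty\|a_i\|_\infty,\quad L=\sum_i\theta_i,\quad \pi_i=\frac{\theta_i}{L},\quad z_i=\frac{L}{\theta_i}y_i, \tag{8}$$

and let $W=Y^TA$. Observe that

$$W=\sum_i y_ia_i^T=\sum_i\pi_iz_ia_i^T,\qquad \|z_ia_i^T\|_\infty=\|z_i\|_\infty\|a_i\|_\infty=L,\ 1\le i\le M,\qquad \sum_i\pi_i=1,\ \pi_i\ge0,\ 1\le i\le M. \tag{9}$$

Now let $\Xi$ be a random rank 1 matrix taking values $z_ia_i^T$ with probabilities $\pi_i$, and let $\Xi_1,\Xi_2,\dots$ be a sample of independent realizations of $\Xi$. Consider the random matrix

$$W_k=\frac1k\sum_{\ell=1}^k\Xi_\ell.$$

Then $W_k$ is, by construction, of the form $Y_k^TA_k$, where $A_k$ is a random $m_k\times n$ submatrix of $A$ with $m_k\le k$.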

As an immediate consequence of Proposition 2.1 we obtain the following statement:

Proposition 2.2 One has

$$E\{\|W_k-W\|_\infty\}\le2Lk^{-1/2}\sqrt{2\ln(2n^2)}. \tag{10}$$

In particular, the probability of the event

$$\mathcal{E}=\big\{\Xi_1,\dots,\Xi_k:\ \|W_k-W\|_\infty\le4Lk^{-1/2}\sqrt{2\ln[2n^2]}\big\}$$

is $\ge1/2$, and whenever this event takes place, we have at our disposal a matrix $Y_k$ and an $m_k\times n$ submatrix $A_k$ of $A$ with $m_k\le k$ such that

$$\|I_n-Y_k^TA_k\|_\infty\le\|I_n-W\|_\infty+\|W_k-W\|_\infty\le\mu_k:=\mu+4Lk^{-1/2}\sqrt{2\ln[2n^2]}. \tag{11}$$

Proof. By (9) we have $\|z_ia_i^T\|_\infty\le L$ for all $i$, and besides this, treating $i$ as a random index distributed in $\{1,\dots,M\}$ according to the probability distribution $\pi=\{\pi_i\}_{i=1}^M$, we have $E\{z_ia_i^T\}=W$. It follows that $\|\Xi_\ell-W\|_\infty\le2L$ and $E\{\Xi_\ell-W\}=0$. If we denote $S_i=\sum_{\ell=1}^i(\Xi_\ell-W)$, then applying Proposition 2.1 (with $d=n^2$ and $\sigma_\ell=2L$) we obtain

$$E\{\|S_k\|_\infty\}\le2L\sqrt{2k\ln[2n^2]},$$

and, since $W_k-W=k^{-1}S_k$, we arrive at (10); the claim on the probability of $\mathcal{E}$ follows from (10) by the Markov inequality. □
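The sampling scheme just described can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the implementation used in the experiments: the function name `sample_approximation` and the list-of-rows representation are hypothetical, the weights are taken as $\theta_i=\|y_i\|_\infty\|a_i\|_\infty$, $\pi_i=\theta_i/L$, $z_i=y_i/\pi_i$, and it is assumed that $L>0$.

```python
import random

def sample_approximation(Y, A, k, rng):
    """Return (W_k, sorted distinct sampled row indices, L).

    W_k = (1/k) * sum of k independent draws of z_i a_i^T, with i ~ pi.
    """
    M, n = len(A), len(A[0])
    theta = [max(abs(v) for v in Y[i]) * max(abs(v) for v in A[i]) for i in range(M)]
    L = sum(theta)                      # assumed positive
    pi = [t / L for t in theta]
    idx = rng.choices(range(M), weights=pi, k=k)  # rows with pi_i = 0 are never drawn
    Wk = [[0.0] * n for _ in range(n)]
    for i in idx:
        scale = 1.0 / (k * pi[i])       # z_i = y_i / pi_i, averaged over k draws
        for p in range(n):
            c = scale * Y[i][p]
            if c != 0.0:
                for q in range(n):
                    Wk[p][q] += c * A[i][q]
    return Wk, sorted(set(idx)), L
```

The distinct indices returned are the rows of $A$ forming the submatrix $A_k$; repeated draws of the same row only change the multiple with which the corresponding row of $Y$ enters $Y_k$.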



Discussion. Proposition 2.2 suggests a certain approach to the synthesis problem. Indeed, according to this Proposition, picking at random $k$ rows $a_{i_\ell}^T$, where $i_1,\dots,i_k$ are sampled independently from the distribution $\pi$, we get with probability at least $1/2$ a random $m_k\times n$ matrix $A_k$, $m_k\le k$, which is provably $s$-good with $s=O(1)\big(L\sqrt{\ln[n]/k}+\mu\big)^{-1}$. When $L=O(1)$, this is nearly as good as it could be, since the sufficient condition for $s$-goodness stated in (1) can justify $s$-goodness of an $m\times n$ sensing matrix with $n\ge O(1)m$ only when $s\le O(1)\sqrt m$, see [5, Proposition 4.2].



2.2 Derandomization 

Looking at the proof of Proposition 2.1, we see that the construction of $A_k$ and $Y_k$ can be derandomized. Indeed, (5) implies that

Whenever $S\in\mathbb{R}^{n\times n}$ and $\beta>0$, there exists $i$ such that

$$V_\beta\big(S+(z_ia_i^T-W)\big)\le V_\beta(S)+\frac{2L^2}{\beta}.$$

Specifically, the above bound is satisfied for every $i$ such that

$$\langle V_\beta'(S),z_ia_i^T-W\rangle\le0,$$

and because $\pi_i\ge0$ and $\sum_i\pi_i(z_ia_i^T-W)=0$, the latter inequality is certainly satisfied for some $i$.

Now assume that, given a sequence $\beta_0\le\beta_1\le\cdots$ of positive reals (nondecreasing, so that item (ii) of Lemma 2.1 yields $V_{\beta_{k+1}}\le V_{\beta_k}$), we build a sequence of matrices $S_i$ according to the following rules:

1. $S_0=0$;

2. $S_{k+1}=S_k+(v_ka_{i_k}^T-W)$ with $i_k\in\{1,\dots,M\}$ and $v_k\in\mathbb{R}^n$ such that

$$V_{\beta_{k+1}}(S_{k+1})\le V_{\beta_k}(S_k)+\delta_k,\qquad\delta_k\le\frac{2L^2}{\beta_k}. \tag{12}$$

Then for every $k\ge1$ the matrix $U_k=k^{-1}S_k$ is of the form $Y_k^TA_k-W$, where $A_k$ is an $m_k\times n$ submatrix of $A$ with $m_k\le k$, and

$$\|S_k\|_\infty\le\beta_k\ln[2n^2]+\sum_{i=0}^{k-1}\delta_i,$$

whence

$$\|Y_k^TA_k-I_n\|_\infty\le\mu+k^{-1}\Big(\beta_k\ln[2n^2]+\sum_{i=0}^{k-1}\delta_i\Big).$$

In particular, for the choice $\beta_i=L\sqrt{\frac{2k}{\ln[2n^2]}}$, $i=0,\dots,k$, we obtain

$$\|Y_k^TA_k-I_n\|_\infty\le\mu+2L\sqrt{\frac{2\ln[2n^2]}{k}}.$$

One can consider at least the following three (numerically efficient) policies for choosing $v_k$ and $i_k$ satisfying (12); we order them according to their computational complexity.

A. Given $S_k$, we test one by one the options $i_k=i$, $v_k=z_i$, $i=1,\dots,M$, until an option satisfying (12) is met (or test all the $M$ options and choose the one which results in the smallest $V_{\beta_{k+1}}(S_{k+1})$). Note that accomplishing a step of this scheme requires $O(Mn^2)$ elementary operations.

A'. In this version of A, we test the options $i_k=i$, $v_k=z_i$ when picking $i$ at random, as independent realizations of the random variable $\iota$ taking values $1,\dots,M$ with probabilities $\pi_i$, until an option with $\langle V_{\beta_k}'(S_k),z_ia_i^T-W\rangle\le0$ is met. Since $E\{\langle V_{\beta_k}'(S_k),z_\iota a_\iota^T-W\rangle\}\le0$, we may hope that this procedure will take substantially fewer steps than the ordered scan through the entire range $1,\dots,M$ of values of $i$.

B. Given $S_k$, we solve $M$ one-dimensional convex optimization problems

$$t_i^*\in\mathop{\mathrm{Argmin}}_t V_{\beta_k}(S_k+tz_ia_i^T-W),\quad1\le i\le M, \tag{13}$$

then select the one, let its index be $i_*$, with the smallest value of $V_{\beta_k}(S_k+t_i^*z_ia_i^T-W)$, and put $v_k=t_{i_*}^*z_{i_*}$, $i_k=i_*$.

If the bisection algorithm is used to find $t_i^*$, solving the problem (13) for one $i$ to the relative accuracy $\epsilon$ requires $O(n^2\ln\epsilon^{-1})$ elementary operations. The total numerical complexity of the step of the method is $O(Mn^2\ln\epsilon^{-1})$.

C. Given $S_k$, we solve $M$ convex optimization problems

$$u_i^*\in\mathop{\mathrm{Argmin}}_{u\in\mathbb{R}^n}V_{\beta_k}(S_k+ua_i^T-W),\quad1\le i\le M, \tag{14}$$

then select the one, let its index be $i_*$, with the smallest value of $V_{\beta_k}(S_k+u_i^*a_i^T-W)$, and set $v_k=u_{i_*}^*$, $i_k=i_*$.

Note that due to the structure of $V_\beta$, to solve (14) it suffices to find a solution to the system

$$\sum_{\ell=1}^n\gamma_\ell\sinh(\alpha_{j\ell}+\gamma_\ell u_j)=0,\quad j=1,\dots,n, \tag{15}$$

where $\alpha_{j\ell}=\beta_k^{-1}[S_k-W]_{j\ell}$ and $\gamma_\ell=\beta_k^{-1}[a_i]_\ell$. Since the equations of the system (15) are independent, one can use bisection to find the component $u_j$ of the solution.^2 Finding a solution to the relative accuracy $\epsilon$ to each equation then requires $O(n\ln\epsilon^{-1})$ arithmetical operations, and the total complexity of solving (14) becomes $O(Mn^2\ln\epsilon^{-1})$.
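To make the simplest of these policies concrete, here is a minimal sketch of the exhaustive variant of policy A with a constant gain $\beta$: at each step every row is tried with $v_k=z_i$ and the one giving the smallest value of the smoothed norm is kept. The names `smooth_max` and `greedy_rows`, the list-of-rows representation, and the smoothing form $V_\beta(z)=\beta\ln\big(d^{-1}\sum_i\cosh(z_i/\beta)\big)$ are assumptions made for this illustration.

```python
import math

def smooth_max(beta, S):
    """V_beta applied to all entries of the matrix S (list of rows)."""
    z = [v for row in S for v in row]
    m = max(abs(v) for v in z) / beta  # factor out the largest exponent
    total = sum(
        0.5 * (math.exp(abs(v) / beta - m) + math.exp(-abs(v) / beta - m))
        for v in z
    )
    return beta * (m + math.log(total / len(z)))

def greedy_rows(Y, A, k, beta):
    """Exhaustive policy A: return (chosen indices, S_k, values of V along the way, L)."""
    M, n = len(A), len(A[0])
    theta = [max(map(abs, Y[i])) * max(map(abs, A[i])) for i in range(M)]
    L = sum(theta)
    W = [[sum(Y[i][p] * A[i][q] for i in range(M)) for q in range(n)] for p in range(n)]
    S = [[0.0] * n for _ in range(n)]
    chosen, values = [], [smooth_max(beta, S)]
    for _ in range(k):
        best = None
        for i in range(M):
            if theta[i] == 0.0:
                continue  # such a row contributes nothing to W
            z = [L / theta[i] * Y[i][p] for p in range(n)]
            cand = [[S[p][q] + z[p] * A[i][q] - W[p][q] for q in range(n)]
                    for p in range(n)]
            val = smooth_max(beta, cand)
            if best is None or val < best[0]:
                best = (val, i, cand)
        values.append(best[0])
        chosen.append(best[1])
        S = best[2]
    return chosen, S, values, L
```

Each step costs $O(Mn^2)$ operations, in line with the estimate for policy A; the recorded values of the potential should never increase by more than $2L^2/\beta$ per step.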



Selecting $Y$ and $W$. Note that the numerical schemes of this section should be initialized with matrices $Y$ and $W=Y^TA$. We can proceed as follows:

1. We start with solving the problem

$$\min_Y\Big\{\sum_{i=1}^M\|y_i\|_\infty\|a_i\|_\infty:\ \|I_n-Y^TA\|_\infty\le\mu\Big\},$$

where $\mu$ is a certain fraction of $\frac{1}{2s}$. Assuming the problem is feasible for the chosen $\mu$, we get in this way the "initial point", the matrix $W=Y^TA$.

2. Then we apply the outlined procedure to find $A_k$ and $Y_k$. At each step $i$ of this procedure, we get a certain $m_i\times n$ submatrix $A_i$ of $A$ and a matrix $Y_i$. When $\|I_n-Y_i^TA_i\|_\infty$ becomes less than $\frac{1}{2s}$, we terminate. Alternatively, we can solve at each step $i$ an auxiliary problem $\min_{Y\in\mathbb{R}^{m_i\times n}}\|I_n-Y^TA_i\|_\infty$ and terminate when the optimal value in this problem becomes less than $\frac{1}{2s}$.



Choosing the sequence $(\beta_i)$. When the number $k$ of steps of the iterative schemes of this section is fixed, the proof of Proposition 2.1 suggests the fixed choice of the "gain sequence" $\beta_i=L\sqrt{\frac{2k}{\ln[2n^2]}}$, $i=1,\dots,k$. When the number $k$ is not known a priori, one can use the sequence computed recursively according to the rule

$$\beta_i=\beta_{i-1}+\frac{2L^2}{\ln[2n^2]\,\beta_{i-1}},\qquad\beta_0=\frac{2L}{\sqrt{\ln[2n^2]}},$$

or, what is essentially the same, the sequence $\beta_\ell=2L\sqrt{\frac{\ell+1}{\ln[2n^2]}}$, $\ell=0,1,\dots$.

Another possible choice of the $\beta_\ell$'s is as follows: observe first that the function $V_\beta(z)$ is jointly convex in $\beta$ and $z$. Therefore, we may modify the above algorithms by adding the minimization in $\beta$. For instance, instead of the optimization problems (13) in item B we can consider $M$ two-dimensional optimization problems

$$(t_i^*,\beta_i^*)\in\mathop{\mathrm{Argmin}}_{t,\ \beta>0}\big\{\beta\ln[2n^2]+V_\beta(S_k+tz_ia_i^T-W)\big\},\quad1\le i\le M;$$

we select the one, let its index be $i_*$, with the smallest value of the objective $V_{\beta_i^*}(S_k+t_i^*z_ia_i^T-W)+\beta_i^*\ln[2n^2]$, and set, as before, $v_k=t_{i_*}^*z_{i_*}$, $i_k=i_*$. Note that such a modification does not significantly increase the complexity estimate of the scheme.

^2 Note that due to the convexity of the left-hand side of the equation in (15), an even faster algorithm of the Newton family can be used.



2.3 Numerical illustration 

Here we report on preliminary numerical experiments with the synthesis problem as posed in the introduction. In our experiment, $A$ is square; specifically, it is the Hadamard matrix $H_{11}$ of order 2048.

Recall that the Hadamard matrix $H_\nu$, $\nu=0,1,\dots$, is a square matrix of order $2^\nu$ given by the recurrence

$$H_0=1,\qquad H_{s+1}=\begin{bmatrix}H_s&H_s\\H_s&-H_s\end{bmatrix},$$

whence $H_\nu$ is a symmetric matrix with entries $\pm1$ and $H_\nu^TH_\nu=2^\nu I_{2^\nu}$.
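The recurrence translates directly into code. The sketch below (plain Python, with the hypothetical function name `hadamard`) builds $H_\nu$ as a list of rows, assuming the standard Sylvester block form $H_{s+1}=[H_s, H_s; H_s, -H_s]$ with $H_0=1$.

```python
def hadamard(nu):
    """Hadamard matrix of order 2**nu via H_{s+1} = [[H_s, H_s], [H_s, -H_s]]."""
    H = [[1]]
    for _ in range(nu):
        top = [row + row for row in H]                      # [H_s, H_s]
        bottom = [row + [-v for v in row] for row in H]     # [H_s, -H_s]
        H = top + bottom
    return H
```

For the experiment described here one would call `hadamard(11)`, which yields the $2048\times2048$ matrix; small orders are convenient for checking symmetry and the identity $H_\nu^TH_\nu=2^\nu I$.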

The goal of the experiment was to extract from $A=H_{11}$ an $m\times2048$ submatrix $A_m$ which satisfies the relation (cf. (1))

$$\mathrm{Opt}(A_m):=\min_{Y_m\in\mathbb{R}^{m\times n}}\|I_n-Y_m^TA_m\|_\infty<\frac{1}{2s},\qquad n=2048, \tag{16}$$

with $s=10$; under this requirement, we would like to have $m$ as small as possible. In Compressed Sensing terms, we are trying to solve the synthesis problem with $A=H_{11}$; in low rank approximation terms, we want to approximate $I_{2048}$ in the uniform norm within accuracy $<0.05$ by a rank $m$ matrix of the form $Y_m^TA_m$, with the rows of $A_m$ extracted from $H_{11}$. The advantages of the Hadamard matrix in our context are twofold:

1. The error bound (10) is proportional to the quantity $L$ defined in (8). By the origin of this quantity, we clearly have $\|Y^TA\|_\infty\le L$, whence $L\ge1-\mu>1-\frac{1}{2s}\ge1/2$ by (3). On the other hand, with $A=H_\nu$ being a Hadamard matrix, setting $Y=2^{-\nu}H_\nu$, so that $Y^TA=I_{2^\nu}$, we ensure the validity of (3) with $\mu=0$ and get $L=1$; that is, $\mu$ is as small as it could be, and $L$ is nearly as small as it could be.

2. Whenever $A_m$ is a submatrix of $H_\nu$, the optimization problem in the left hand side of (16) is easy to solve.

Item 2 deserves an explanation. Clearly, the optimization program in (16) reduces to the series of $n=2048$ LP programs

$$\mathrm{Opt}_i(A_m)=\min_{y\in\mathbb{R}^m}\|e_i-A_m^Ty\|_\infty,\quad1\le i\le n, \tag{17}$$

where $e_i$ is the $i$-th standard basic orth in $\mathbb{R}^n$, and $\mathrm{Opt}(A_m)=\max_i\mathrm{Opt}_i(A_m)$. The point is (for justification, see Appendix B) that when $A_m$ is an $m\times n$ submatrix of the $n\times n$ Hadamard matrix, $\mathrm{Opt}_i(A_m)$ is independent of $i$, so that checking the inequality in (16) requires solving a single LP program with $m$ variables rather than solving $n$ LP programs of the same size.

The experiment was organized as follows. As already mentioned, we used $\nu=11$ (that is, $n=2048$) and $s=10$ (that is, the desired uniform norm accuracy of approximating $I_{2048}$ by $Y_m^TA_m$ was 0.05). We compared two approximation policies:

• "Blind" approximation: we choose a random permutation $\sigma(\cdot)$ of the indices $1,\dots,2048$ and look at the submatrices $A^k$, $k=1,2,\dots$, obtained by extracting from $H_{11}$ the rows with indices $\sigma(1),\sigma(2),\dots,\sigma(k)$, until a submatrix satisfying (16) is met. This is a refinement of the Random sampling algorithm as applied to $A=H_{11}$ and $Y=2^{-11}A$, which results in $W=I_{2048}$. The refinement is that instead of looking for an approximation of $W=I_{2048}$ of the form $\frac1k\sum_{\ell=1}^ka_{i_\ell}a_{i_\ell}^T$, where $i_1,i_2,\dots$ are independent realizations of a random variable $\iota$ taking values $1,\dots,2048$ with equal probabilities (as prescribed by (8) in the case of $A=H_\nu$), we look for the best approximation of the form $Y_k^TA^k$, where $A^k$ is the submatrix of $A$ with the row indices $\sigma(1),\dots,\sigma(k)$.

• "Active" approximation, which is obtained from algorithm A' by the same refinement as in the previous item.

In our experiments, we ran every policy 6 times. The results were as follows:

"Blind" policy B: the rank of the 0.05-approximation of $W=I_{2048}$ varied from 662 to 680.

"Active" policy A: the rank of the 0.05-approximation of $W$ varied from 617 to 630.

Note that in both algorithms the resulting matrix $A_m$ is built "row by row", and the certified levels of goodness of the intermediate matrices $A^1,A^2,\dots$ are computed. In the table below we indicate, for the most successful (resulting in the smallest $m$) of the 6 runs of each algorithm, the smallest values of $k$ for which $A^k$ was certified to be $s$-good, $s=1,2,\dots,10$:

  s     1    2    3    4    5    6    7    8    9   10
  B    15   58  121  197  279  343  427  512  584  662
  A    12   47  104  172  246  323  399  469  547  617

Finally, we remark that with $A$ being the Hadamard matrix $H_\nu$, the "no refinement" versions of our policies would terminate according to the criterion $\|I_n-\frac1kA_k^TA_k\|_\infty<\frac{1}{2s}$, which, on closer inspection, is nothing but a slightly spoiled version of the goodness test based on mutual incoherence [2].^3 In the experiments we are reporting, this criterion is essentially weaker than the one based on (16): for the best, over the 6 runs of the algorithms A and B, 10-good submatrices $A_m$ of $H_{11}$ we got, the test based on mutual incoherence certifies levels of goodness as low as 5 (in the case of B) and 7 (in the case of A).
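For comparison purposes, the mutual incoherence test is easy to code. The sketch below assumes the usual form of the test: for a $k\times n$ matrix $B$ with nonzero columns $b_1,\dots,b_n$, compute $\mu(B)=\max_{i\ne j}|b_i^Tb_j|/b_i^Tb_i$ and certify $s$-goodness for all $s<\frac{1+\mu(B)}{2\mu(B)}$. The function names and the convention of returning $n$ when $\mu(B)=0$ are illustrative assumptions.

```python
import math

def mutual_incoherence(B):
    """mu(B) = max over i != j of |b_i^T b_j| / (b_i^T b_i), columns b_i of B."""
    k, n = len(B), len(B[0])
    cols = [[B[r][c] for r in range(k)] for c in range(n)]
    mu = 0.0
    for i in range(n):
        norm_sq = sum(v * v for v in cols[i])  # columns are assumed nonzero
        for j in range(n):
            if i != j:
                inner = sum(a * b for a, b in zip(cols[i], cols[j]))
                mu = max(mu, abs(inner) / norm_sq)
    return mu

def incoherence_certified_s(B):
    """Largest integer s with s < (1 + mu(B)) / (2 mu(B)); n if mu(B) = 0."""
    mu = mutual_incoherence(B)
    if mu == 0.0:
        return len(B[0])  # assumption: cap at the number of columns
    bound = (1.0 + mu) / (2.0 * mu)
    return max(math.ceil(bound) - 1, 0)  # strict inequality s < bound
```

For a submatrix of a Hadamard matrix all columns have the same Euclidean norm, which is why the "no refinement" stopping rule reduces to a condition on $\mu(A_k)$ alone.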



3 Low rank approximation of arbitrary matrices 



3.1 Randomized approximation 

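Proposition 3.1 Let $D>0$, and let $P=[p_1^T;\dots;p_m^T]\in\mathbb{R}^{m\times d}$ and $Q=[q_1^T;\dots;q_n^T]\in\mathbb{R}^{n\times d}$ be such that the Euclidean norms of the rows $p_i$ and $q_j$ of $P$ and $Q$ are bounded by $\sqrt D$. Let an $m\times n$ matrix $A$ be represented as

$$A=PQ^T.$$

Given a positive integer $k$, consider the random matrix

$$A_k=\frac1kP\Big[\sum_{i=1}^k\xi_i\xi_i^T\Big]Q^T=\frac1k\sum_{i=1}^k\eta_i\zeta_i^T,\qquad\eta_i:=P\xi_i,\ \zeta_i:=Q\xi_i, \tag{18}$$

where $\xi_i\sim\mathcal{N}(0,I_d)$, $i=1,\dots,k$, are independent standard normal random vectors from $\mathbb{R}^d$. Then

$$k\ge8\ln(4mn)\ \Rightarrow\ \mathrm{Prob}\Big\{\|A_k-A\|_\infty\le D\sqrt{\frac{8\ln(4mn)}{k}}\Big\}\ge\frac12. \tag{19}$$

For the proof, see Appendix C.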
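The randomized construction is short enough to sketch in plain Python. The function name `randomized_low_rank` and the list-of-rows representation are hypothetical, and the construction assumed is $A_k=\frac1k\sum_{i=1}^k(P\xi_i)(Q\xi_i)^T$ with independent standard normal $\xi_i\in\mathbb{R}^d$, for a matrix given in factored form $A=PQ^T$.

```python
import random

def randomized_low_rank(P, Q, k, rng):
    """Rank <= k approximation (1/k) * sum_i (P xi_i)(Q xi_i)^T of A = P Q^T."""
    m, n, d = len(P), len(Q), len(P[0])
    Ak = [[0.0] * n for _ in range(m)]
    for _ in range(k):
        xi = [rng.gauss(0.0, 1.0) for _ in range(d)]            # xi ~ N(0, I_d)
        eta = [sum(p * x for p, x in zip(row, xi)) for row in P]   # eta = P xi
        zeta = [sum(q * x for q, x in zip(row, xi)) for row in Q]  # zeta = Q xi
        for i in range(m):
            for j in range(n):
                Ak[i][j] += eta[i] * zeta[j] / k
    return Ak
```

As a usage example, taking $P=Q=I_4$ (so $A=I_4$ and $D=1$) and a large $k$ should give an entrywise error of the order of $k^{-1/2}$; since the guarantee is probabilistic, a particular run may be compared against the bound but is not certain to satisfy it.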



3.2 The norm associated with Proposition 3.1

Some remarks are in order. The result of Proposition 3.1 brings to our attention the smallest $D$ such that a given matrix $A$ can be decomposed into the product $PQ^T$ of two matrices with the Euclidean lengths of the rows not exceeding $\sqrt D$. On closer inspection, $D$ turns out to be an easy-to-describe norm on the space $\mathbb{R}^{m\times n}$ of $m\times n$ matrices. Specifically, let $\|A\|$, $A\in\mathbb{R}^{m\times n}$, be

$$\|A\|=\min_{t,M,N}\Big\{t:\ \begin{bmatrix}M&A\\A^T&N\end{bmatrix}\succeq0,\ M_{ii}\le t\ \forall i,\ N_{jj}\le t\ \forall j\Big\}.$$

This relation clearly defines a norm, and one clearly has $\|A\|=\|A^T\|$.

^3 The mutual incoherence test is as follows: given a $k\times n$ matrix $B=[b_1,\dots,b_n]$ with nonzero columns, we compute the quantity $\mu(B)=\max_{i\ne j}|b_i^Tb_j|/b_i^Tb_i$ and claim that $B$ is $s$-good for all $s$ such that $s<\frac{1+\mu(B)}{2\mu(B)}$. With the Hadamard $A$, the "no refinement" criterion for our scheme is nothing but $s<\frac{1}{2\mu(A_k)}$.

Proposition 3.2 For every $A\in\mathbb{R}^{m\times n}$, there exists a representation $A=PQ^T$ with $P\in\mathbb{R}^{m\times(m+n)}$, $Q\in\mathbb{R}^{n\times(m+n)}$ and the Euclidean norms of rows in $P$, $Q$ not exceeding $\sqrt{\|A\|}$. Vice versa, if $A=PQ^T$ with the rows in $P$, $Q$ of Euclidean norms not exceeding $\sqrt D$, then $\|A\|\le D$.

The next result summarizes the basic properties of the norm we have introduced.

Proposition 3.3 Let $A$ be an $m\times n$ matrix. Then

(i) $\|A\|_\infty\le\|A\|\le\sqrt{\min[m,n]}\,\|A\|_\infty$.

(ii) $\|A\|\le\|A\|_{2,2}$, where $\|A\|_{2,2}$ is the usual spectral norm of $A$ (the maximal singular value).

(iii) If $A$ is symmetric positive semidefinite, then $\|A\|=\|A\|_\infty$.

(iv) If the Euclidean norms of all rows (or all columns) of $A$ are $\le D$, then $\|A\|\le D$.

For the proof, see Appendix D.

3.3 Lower bound

We have seen that if $A\in\mathbb{R}^{m\times n}$, then the $\|\cdot\|_\infty$-error of the best in this norm approximation of $A$ by a matrix of rank $k$ is at most $O(1)\sqrt{\ln[mn]}\,\|A\|k^{-1/2}$. We intend to demonstrate that in general this bound is unimprovable, up to a factor logarithmic in $m$ and $n$. Specifically, the following result holds:

Proposition 3.4 When $n\ge2k$, the $\|\cdot\|_\infty$ error of any approximation of the unit matrix $I_n$ by a matrix of rank $k$ is at least

$$\frac{1}{2\sqrt k}. \tag{20}$$

Note that $\|I_n\|=1$.

Proof [cf. [5, Proposition 4.2]]. Let $\alpha(n,k)$ be the minimal $\|\cdot\|_\infty$ error of approximation of $I_n$ by a matrix of rank $\le k$; this function clearly is nondecreasing in $n$. Let $\nu$ be an integer such that $k<\nu\le n$, and let $A$ be a $\nu\times\nu$ matrix of rank $\le k$ such that $\|I_\nu-A\|_\infty=\alpha:=\alpha(\nu,k)$. By the variational characterization of singular values, at least $\nu-k$ singular values of $I_\nu-A$ are $\ge1$, whence $\mathrm{Tr}([I_\nu-A][I_\nu-A]^T)\ge\nu-k$. On the other hand, $\|I_\nu-A\|_\infty\le\alpha$, whence $\mathrm{Tr}([I_\nu-A][I_\nu-A]^T)\le\nu^2\alpha^2$. We conclude that $\alpha^2(\nu,k)\ge\frac{\nu-k}{\nu^2}$ for all $\nu$ with $k<\nu\le n$, whence, taking $\nu=2k$, $\alpha^2(n,k)\ge\frac{1}{4k}$ when $n\ge2k$. □

We have seen that when $A\in\mathbb{R}^{m\times n}$, $A$ admits rank-$k$ approximations with the approximation error, measured in the $\|\cdot\|_\infty$-norm, of order of $\sqrt{\ln[mn]}\,\|A\|k^{-1/2}$. Note that the error bound deteriorates as $\|A\|$ grows. A natural question is whether we could get similar results with a "weaker" norm of $A$ as a scaling factor. Seemingly the best we could hope for is $\|A\|_\infty$ in the role of the scaling factor, meaning that whenever all entries of an $m\times n$ matrix $A$ are in $[-1,1]$, $A$ can be approximated in the $\|\cdot\|_\infty$-norm by a matrix of rank $k$ with approximation error which, up to a factor logarithmic in $m,n$, depends solely on $k$ and goes to $0$ as $k$ goes to infinity. Unfortunately, the reality does not meet this hope. Specifically, let $A=H_\nu$ be the $n\times n$ Hadamard matrix ($n=2^\nu$), so that $\|A\|_\infty=1$. Since $H^TH=nI_n$, all $n$ singular values of the matrix are equal to $\sqrt n$, whence for every $n\times n$ matrix $B$ of rank $k<n$ the Frobenius norm of $A-B$ is at least $\sqrt{n(n-k)}$, meaning that the uniform norm of $A-B$ is at least $\sqrt{1-k/n}$. We conclude that the rank of a matrix which approximates $A$ with $\|\cdot\|_\infty$-error $\le1/4$ should be of order of $n$.
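The last conclusion can be made explicit with a line of arithmetic; the computation below only rearranges the bound $\sqrt{1-k/n}$ stated above, and the numerical instance $n=2048$ is chosen here for illustration:

```latex
\sqrt{1-\frac{k}{n}}\le\frac14
\;\Longleftrightarrow\;
1-\frac{k}{n}\le\frac{1}{16}
\;\Longleftrightarrow\;
k\ge\frac{15}{16}\,n,
\qquad\text{e.g. } n=2048\ \Rightarrow\ k\ge1920.
```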
A Proof of Lemma 2.1

Properties (i) and (ii) are immediate consequences of the definition of $V_\beta$. Observe that $V_\beta$ is convex and continuously differentiable with

$$\frac{d}{dt}\Big|_{t=0}V_\beta(x+th)=\frac{\sum_{i=1}^d\sinh(x_i/\beta)h_i}{\sum_{i=1}^d\cosh(x_i/\beta)},$$

whence $\|V_\beta'(x)\|_1\le1$ for $x\in\mathbb{R}^d$. Verification of (5) takes one line: $V_\beta$ is twice continuously differentiable with

$$\frac{d^2}{dt^2}\Big|_{t=0}V_\beta(x+th)=\beta^{-1}\left[\frac{\sum_{i=1}^d\cosh(x_i/\beta)h_i^2}{\sum_{i=1}^d\cosh(x_i/\beta)}-\left(\frac{\sum_{i=1}^d\sinh(x_i/\beta)h_i}{\sum_{i=1}^d\cosh(x_i/\beta)}\right)^2\right]\le\beta^{-1}\|h\|_\infty^2.$$

□

B Problems (17) in the case of Hadamard matrix A

We claim that if $A_m$ is an $m\times2^\nu$ submatrix of the Hadamard matrix $H_\nu$ of order $n=2^\nu$, then the optimal values in all problems (17) are equal to each other. The explanation is as follows. Let $G$ be a finite abelian group of cardinality $n$. Recall that a character of $G$ is a complex-valued function $\xi(g)$ such that $\xi(0)=1$ and $\xi(g+h)=\xi(g)\xi(h)$ for all $g,h\in G$; from this definition it immediately follows that $|\xi(g)|\equiv1$. The characters of a finite abelian group $G$ form an abelian group $G_*$, the multiplication being the pointwise multiplication of functions, and this group is isomorphic to $G$. The Fourier Transform matrix associated with $G$ is the $n\times n$ matrix with rows indexed by $\xi\in G_*$, columns indexed by $g\in G$, and entries $\xi(g)$. For example, the usual DFT matrix of order $n$ corresponds to the cyclic group $G=\mathbb{Z}_n:=\mathbb{Z}/n\mathbb{Z}$, while the Hadamard matrix $H_\nu$ is nothing but the Fourier Transform matrix associated with $G=[\mathbb{Z}_2]^\nu$ (in this case, all characters take values $\pm1$). For $g\in G$, let $e_g(h)$ stand for the function on $G$ which is equal to $1$ at $h=g$ and is equal to $0$ at $h\ne g$. Given an $m$-element subset $Q$ of $G_*$, consider the submatrix $A=[\xi(g)]_{\xi\in Q,\,g\in G}$ of the Fourier Transform matrix, along with $n$ optimization problems

$$\min_{y\in\mathbb{C}^m}\|\Re[e_g-A^Ty]\|_\infty=\min_{y_\xi\in\mathbb{C}}\max_{h\in G}\Big|\Re\Big[e_g(h)-\sum_{\xi\in Q}y_\xi\xi(h)\Big]\Big|,\quad g\in G. \tag{$P_g$}$$

These problems clearly have equal optimal values, due to

$$\max_{h\in G}\Big|\Re\Big[e_g(h)-\sum_{\xi\in Q}y_\xi\xi(h)\Big]\Big|=\max_{h\in G}\Big|\Re\Big[e_0(h-g)-\sum_{\xi\in Q}[y_\xi\xi(g)]\xi(h-g)\Big]\Big|=\max_{f\in G}\Big|\Re\Big[e_0(f)-\sum_{\xi\in Q}[y_\xi\xi(g)]\xi(f)\Big]\Big|.$$

As applied to $G=\mathbb{Z}_2^\nu$, this observation implies that all quantities given by (17) are the same.



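C Proof of Proposition 3.1

The reasoning to follow is completely standard. Let us fix $i$, $1\le i\le m$, and $j$, $1\le j\le n$, and let $\xi\sim\mathcal{N}(0,I_d)$, $\mu=D^{-1/2}p_i^T\xi$, $\nu=D^{-1/2}q_j^T\xi$, and $\alpha=D^{-1}A_{ij}$. Then $[\mu;\nu]$ is a normal random vector with $E\{\mu^2\}\le1$, $E\{\nu^2\}\le1$ and $E\{\mu\nu\}=\alpha$. We can find a normal random vector $z=[u;v]\sim\mathcal{N}(0,I_2)$ such that $\mu=au$, $\nu=bu+cv$; note that $a^2\le1$, $b^2+c^2\le1$ and $ab=E\{\mu\nu\}=\alpha$. Note that $\mu\nu=z^TBz$ with

$$B=\begin{bmatrix}ab&ac/2\\ac/2&0\end{bmatrix}.$$

Denoting by $\lambda_1,\lambda_2$ the eigenvalues of $B$, we have

$$\lambda_1+\lambda_2=\mathrm{Tr}(B)=ab=\alpha,\qquad\lambda_1^2+\lambda_2^2=\mathrm{Tr}(BB^T)=a^2(b^2+c^2/2)\le1. \tag{21}$$

Now let $\gamma\in\mathbb{R}$ be such that $|\gamma|\le1/4$. By (21) we have $I_2-2\gamma B\succ0$, whence

$$E\{\exp\{\gamma\mu\nu\}\}=E\{\exp\{\gamma z^TBz\}\}=\mathrm{Det}^{-1/2}(I_2-2\gamma B)=[(1-2\gamma\lambda_1)(1-2\gamma\lambda_2)]^{-1/2}.$$

Let $t=\sqrt{8\ln(4mn)}$ and $k\ge t^2$, and let $[\mu_\ell;\nu_\ell]$, $1\le\ell\le k$, be independent random vectors with the same distribution as that of $[\mu;\nu]$. Then for every $\gamma\in(0,1/4]$ we have

$$\begin{aligned}
\kappa_+:=\mathrm{Prob}\{k[A_k]_{ij}>D[\alpha k+tk^{1/2}]\}&=\mathrm{Prob}\Big\{\sum_{\ell=1}^k\mu_\ell\nu_\ell>\alpha k+tk^{1/2}\Big\}\\
&\le E\Big\{\exp\Big\{\gamma\sum_{\ell=1}^k\mu_\ell\nu_\ell\Big\}\Big\}\exp\{-\gamma k(\alpha+k^{-1/2}t)\}\\
&=[E\{\exp\{\gamma\mu\nu\}\}]^k\exp\{-\gamma k(\alpha+k^{-1/2}t)\}\\
&=[(1-2\gamma\lambda_1)(1-2\gamma\lambda_2)]^{-k/2}\exp\{-\gamma k(\alpha+k^{-1/2}t)\},
\end{aligned}$$

so that

$$\begin{aligned}
\ln\kappa_+&\le\frac k2\big[-2\gamma(\alpha+k^{-1/2}t)-\ln(1-2\gamma\lambda_1)-\ln(1-2\gamma\lambda_2)\big]\\
&=\frac k2\big[-2\gamma[\lambda_1+\lambda_2]-2\gamma k^{-1/2}t-\ln(1-2\gamma\lambda_1)-\ln(1-2\gamma\lambda_2)\big]\\
&\le\frac k2\big[-2\gamma k^{-1/2}t+4\gamma^2(\lambda_1^2+\lambda_2^2)\big],
\end{aligned}$$

where the last inequality follows from $|2\gamma\lambda_s|\le1/2$ for $s=1,2$, and $-\ln(1-r)-r\le r^2$ when $|r|\le1/2$. Using (21) we obtain

$$\ln\kappa_+\le\frac k2\big[-2\gamma k^{-1/2}t+4\gamma^2\big].$$

Setting $\gamma=\frac{t}{4k^{1/2}}$ (this results in $0<\gamma\le1/4$ due to $k^{1/2}\ge t$), we get

$$\mathrm{Prob}\{k[A_k]_{ij}>A_{ij}k+Dtk^{1/2}\}=\kappa_+\le\exp\{-t^2/8\}=(4mn)^{-1}.$$

Letting $\kappa_-=\mathrm{Prob}\{k[A_k]_{ij}<A_{ij}k-Dk^{1/2}t\}$, we have

$$\kappa_-\le E\Big\{\exp\Big\{-\gamma\sum_{\ell=1}^k\mu_\ell\nu_\ell\Big\}\Big\}\exp\{-\gamma k(-\alpha+k^{-1/2}t)\}$$

for all $\gamma\in(0,1/4]$, whence, same as above,

$$\mathrm{Prob}\{k[A_k]_{ij}<kA_{ij}-Dk^{1/2}t\}=\kappa_-\le(4mn)^{-1}.$$

We see that

$$\mathrm{Prob}\{|[A_k]_{ij}-A_{ij}|>Dtk^{-1/2}\}\le(2mn)^{-1}.$$

Since this relation holds true for all $i,j$, we conclude that

$$\mathrm{Prob}\{\|A_k-A\|_\infty>Dk^{-1/2}t\}\le1/2.$$

□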
D Proofs for section 3.2

Proof of Proposition 3.2. First claim: there exist $M$, $N$ such that the matrix

$$\mathcal{A}=\begin{bmatrix}M&A\\A^T&N\end{bmatrix}$$

is positive semidefinite and has all diagonal entries, and then all entries, in $[-\|A\|,\|A\|]$. Let $\mathcal{A}=BB^T$; then the rows in $B$ have Euclidean norms $\le\sqrt{\|A\|}$. Representing $B=[P;Q]$ with $m$ rows in $P$ and $n$ rows in $Q$, the relation $[P;Q][P;Q]^T=\mathcal{A}$ implies that $A=PQ^T$.

Second claim: Let $A=PQ^T$ with the Euclidean norms of rows in $P$, $Q$ not exceeding $\sqrt D$. Then

$$\begin{bmatrix}PP^T&A\\A^T&QQ^T\end{bmatrix}=\begin{bmatrix}P\\Q\end{bmatrix}\begin{bmatrix}P\\Q\end{bmatrix}^T\succeq0,$$

and the diagonal entries in $M=PP^T$ and $N=QQ^T$ do not exceed $D$. □

Proof of Proposition 3.3. (i): The first inequality in (i) is evident. Let us prove the second. W.l.o.g. we can assume $\|A\|_\infty\le1$. In this case our statement reads

$$\mathrm{Opt}:=\min_{t,M,N}\Big\{t:\ \begin{bmatrix}M&A\\A^T&N\end{bmatrix}\succeq0,\ t-M_{ii}\ge0\ \forall i,\ t-N_{jj}\ge0\ \forall j\Big\}\le D:=\sqrt{\min[m,n]}.$$

Assume, on the contrary, that $\mathrm{Opt}>D$. Since the semidefinite problem defining $\mathrm{Opt}$ is strictly feasible, the dual problem

$$\max_{X,Y,Z,\lambda,\rho}\Big\{-2\mathrm{Tr}(Z^TA):\ \begin{bmatrix}X&Z\\Z^T&Y\end{bmatrix}\succeq0,\ \lambda\ge0,\ \rho\ge0,\ \mathrm{Tr}(XM)+\mathrm{Tr}(YN)+\sum_i\lambda_i(t-M_{ii})+\sum_j\rho_j(t-N_{jj})\equiv t\ \ \forall M,N,t\Big\}$$

has a feasible solution with the value of the objective $>D$. In other words, there exist nonnegative vectors $\lambda\in\mathbb{R}^m$, $\rho\in\mathbb{R}^n$ and a matrix $V=-Z\in\mathbb{R}^{m\times n}$ such that

(a) $\begin{bmatrix}\mathrm{Diag}\{\lambda\}&V\\V^T&\mathrm{Diag}\{\rho\}\end{bmatrix}\succeq0$,

(b) $\sum_i\lambda_i+\sum_j\rho_j=1$,

(c) $2\mathrm{Tr}(V^TA)>D$.

By (a), letting $L=\mathrm{Diag}\{\sqrt{\lambda_i}\}$, $R=\mathrm{Diag}\{\sqrt{\rho_j}\}$, we have $V=LWR$ with certain $W$, $\|W\|_{2,2}\le1$ ($\|\cdot\|_{2,2}$ is the usual matrix norm, the maximum singular value), thus

$$\begin{aligned}
2\mathrm{Tr}(V^TA)=2\mathrm{Tr}(RW^TLA)&\le2\sum_{i,j}|[LWR]_{ij}|=2\sum_{i,j}L_{ii}|W_{ij}|R_{jj}\\
&\le2\big\|[|W_{ij}|]_{i,j}\big\|_{2,2}\sqrt{\sum_i\lambda_i}\sqrt{\sum_j\rho_j}\\
&\underbrace{\le}_{(*)}2\sqrt{\min[m,n]}\sqrt{\Big(\sum_i\lambda_i\Big)\Big(\sum_j\rho_j\Big)}\le D,
\end{aligned} \tag{22}$$

where the concluding $\le$ is due to (b), and $(*)$ is given by the following reasoning: w.l.o.g. we can assume that $n\le m$. Since $W$ is of the matrix norm $\le1$, the columns $U_j$ of $U=[|W_{ij}|]_{i,j}$ satisfy $\|U_j\|_2\le1$, whence

$$\|Ux\|_2\le\sum_{j=1}^n|x_j|\|U_j\|_2\le\|x\|_1\le\sqrt n\|x\|_2.$$

The resulting inequality in (22) contradicts (c); we have arrived at the desired contradiction. (i) is proved.

(ii): This is evident, since

$$\begin{bmatrix}\|A\|_{2,2}I_m&A\\A^T&\|A\|_{2,2}I_n\end{bmatrix}\succeq0.$$

(iii): This is evident, since for $A\succeq0$ we have

$$\begin{bmatrix}A&A\\A&A\end{bmatrix}\succeq0.$$

(iv): Since $\|A\|=\|A^T\|$, it suffices to consider the case when the rows of $A$ are of the norm not exceeding $D$. In this case, the result is readily given by the fact that

$$\begin{bmatrix}D^{-1}AA^T&A\\A^T&DI_n\end{bmatrix}\succeq0.$$

□